17 January 2025

Potential major role for language models in the information landscape

As generative AI advances, language models are playing an increasingly critical role in combating unreliable information. Linguist Chantal van Son of Vrije Universiteit has explored how new techniques and high-quality datasets, such as PANLI, can enable reliable applications of language models.

The spread of unreliable information through generative AI can have serious consequences. However, according to Van Son's research, language models that are properly trained and evaluated can also help answer complex questions.

 

Van Son highlights a possible application: “Imagine you’re wondering whether to vaccinate your child. Language models could be used in applications that summarize claims about vaccine safety and effectiveness. They could reveal which sources advocate for vaccination, which oppose it, and what the underlying arguments are.”

 

Limitations of existing datasets
An analysis of existing datasets showed that they often fall short. Many are based on artificial text and fail to account for the multiple perspectives present in texts such as news articles or social media posts.

 

New techniques: PANLI dataset
Van Son developed the PANLI (Perspective-Aware Natural Language Inference) dataset. Built from texts about vaccination, it links sentences according to their meaning. What sets the dataset apart is that it evaluates the relationships between sentences from the perspective of both the author and the cited sources. This approach opens new possibilities for understanding and leveraging subjectivity in language models.
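
To make the idea of perspective-aware annotation concrete, here is a minimal sketch in Python of how one such record could be represented. It is not the actual PANLI format: the class names, the field names, the three-way label set (borrowed from standard natural language inference) and the example sentences are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Relation(Enum):
    # Standard three-way NLI labels; that PANLI uses exactly these is an assumption.
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class PerspectiveJudgment:
    # Hypothetical field: whose point of view the label reflects,
    # e.g. "author" or the name of a cited source.
    holder: str
    relation: Relation


@dataclass(frozen=True)
class SentencePair:
    premise: str
    hypothesis: str
    judgments: Tuple[PerspectiveJudgment, ...]

    def relation_for(self, holder: str) -> Optional[Relation]:
        """Return the label assigned from one perspective, or None if absent."""
        for judgment in self.judgments:
            if judgment.holder == holder:
                return judgment.relation
        return None

    def perspectives_disagree(self) -> bool:
        """True when at least two perspectives assign different labels."""
        return len({judgment.relation for judgment in self.judgments}) > 1


# Invented example, not taken from the dataset: the cited critics commit to
# the claim, while the author merely reports it without endorsing it.
example = SentencePair(
    premise="Critics say the vaccine is unsafe.",
    hypothesis="The vaccine is unsafe.",
    judgments=(
        PerspectiveJudgment(holder="critics", relation=Relation.ENTAILMENT),
        PerspectiveJudgment(holder="author", relation=Relation.NEUTRAL),
    ),
)
```

In this sketch a single sentence pair can carry different labels depending on whose view is being asked about, which is the kind of distinction a dataset with only one label per pair cannot express.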

 

Van Son believes the PANLI dataset represents a significant advance for practical applications of language models, for example in medical decision-making or media tools.

 

Read the full article on the Vrije Universiteit website.